
[ENH] Support regex for local Chroma #4527


Merged (3 commits) on May 27, 2025

Conversation

Sicheng-Pan
Contributor

@Sicheng-Pan Sicheng-Pan commented May 12, 2025

Description of changes

Summarize the changes made by this PR.

  • Improvements & Bug fixes
    • N/A
  • New functionality
    • Support regex filtering for local Chroma

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?


Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?


@Sicheng-Pan Sicheng-Pan marked this pull request as ready for review May 12, 2025 18:03
Contributor

propel-code-bot bot commented May 12, 2025

Add Regex Filtering Support for Local Chroma

This PR adds regular expression (regex) filtering to local Chroma collections, so users can retrieve documents by regex. It updates the Rust SQLite backend to support the REGEXP operator, changes the configuration to enable SQLite regex, and adds unit tests to verify the behavior.

Key Changes:
• Implements REGEXP-based filtering in local Chroma by updating the query logic in rust/segment/src/sqlite_metadata.rs.
• Enables SQLite's REGEXP operator in the config initialization (rust/sqlite/src/config.rs) and specifies the required sqlx feature in rust/sqlite/Cargo.toml.
• Updates dependencies and lockfile to support regex.
• Adds unit tests in chromadb/test/property/test_filtering.py to verify correct regex behavior in document filtering.
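The REGEXP mechanism the key changes rely on can be sketched in Python with the standard-library `sqlite3` module. This is an illustration only, not the PR's Rust code. In SQLite, `X REGEXP Y` is shorthand for a call to a user function `regexp(Y, X)`, and a stock SQLite library typically ships without that function, so it has to be registered on the connection before the operator can be used. The `documents` table, the `connect_with_regexp` and `filter_documents` helpers, and the sample rows are assumptions made up for this sketch.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backs `value REGEXP pattern`; NULL values never match."""
    if value is None:
        return False
    return re.search(pattern, value) is not None


def connect_with_regexp(path=":memory:"):
    """Open a connection with the REGEXP operator enabled (hypothetical helper)."""
    conn = sqlite3.connect(path)
    # SQLite rewrites `X REGEXP Y` into regexp(Y, X), so the pattern arrives first.
    conn.create_function("regexp", 2, regexp)
    return conn


def filter_documents(conn, pattern):
    """Return the ids of documents whose body matches the pattern."""
    rows = conn.execute(
        "SELECT id FROM documents WHERE body REGEXP ? ORDER BY id", (pattern,)
    )
    return [row[0] for row in rows]


conn = connect_with_regexp()
# Toy schema for illustration; it is not Chroma's actual storage layout.
conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [("a", "error code 404"), ("b", "all good"), ("c", "error code 500")],
)
matches = filter_documents(conn, r"error code \d+")
```

Here `matches` holds the ids of the two rows whose body contains an error code. Which regex dialect applies depends on the function that is registered: this sketch uses Python's `re`, which may differ in detail from the engine used on the Rust side.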

Affected Areas:
• Rust SQLite backend query logic
• SQLite database configuration and feature flags
• Python property-based/document filtering tests
• Cargo.toml and Cargo.lock dependency management

This summary was automatically generated by @propel-code-bot

@sanketkedia
Contributor

Can you add a basic test for this?

@Sicheng-Pan Sicheng-Pan requested a review from HammadB May 12, 2025 22:37
Collaborator

@HammadB HammadB left a comment


Makes sense to me. Is the regex implementation here a full scan?

@HammadB
Collaborator

HammadB commented May 19, 2025

Please add basic tests on the Python side. Also, does the JS client support this?

@Sicheng-Pan
Contributor Author

Yes, this will be a scan on the document index.
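Why a regex predicate ends up as a scan can be illustrated on a toy SQLite table, again in Python rather than the PR's Rust. Because REGEXP is an opaque user function, the query planner cannot use an index to narrow the candidates and has to test every row, whereas an equality predicate on the same column can seek through the index. The `documents` table, the `documents_body` index, and the `query_plan` helper are assumptions for this sketch and do not reflect Chroma's actual document index.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# Register regexp() so that statements using REGEXP can be prepared at all.
conn.create_function(
    "regexp", 2, lambda p, v: v is not None and re.search(p, v) is not None
)
# Toy schema for illustration only.
conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT)")
conn.execute("CREATE INDEX documents_body ON documents (body)")


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


regex_plan = query_plan(
    conn, "SELECT id FROM documents WHERE body REGEXP ?", (r"\d+",)
)
equality_plan = query_plan(
    conn, "SELECT id FROM documents WHERE body = ?", ("x",)
)
```

The exact wording of the plan varies by SQLite version, but the regex query should be reported as a `SCAN` step and the equality query as a `SEARCH` step that uses the index.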

@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-12-_enh_support_regex_for_local_chroma branch from 1dc0d1f to 07927af Compare May 27, 2025 18:45
@Sicheng-Pan
Contributor Author

> Please add basic tests on the Python side. Also, does the JS client support this?

JS client support will be added in the new client implementation.

@Sicheng-Pan Sicheng-Pan merged commit 8409cc4 into main May 27, 2025
72 checks passed

3 participants